Abstract

Green, Tao and Ziegler [GT, TZ] prove "Dense Model Theorems" of the following form: if
R is a (possibly very sparse) pseudorandom subset of a set X, and D is a dense subset of R, then
D may be modeled by a set M whose density inside X is approximately the same as the density
of D in R. More generally, they show that a function that is majorized by a pseudorandom
measure can be written as a sum of a bounded function having the same expectation plus a
function that is "indistinguishable from zero." This theorem plays a key role in the proof of
the Green-Tao Theorem [GT] that the primes contain arbitrarily long arithmetic progressions.

In this note, we present a new proof of the Green-Tao-Ziegler Dense Model Theorem, which 
was discovered independently by ourselves [RTTV] and Gowers [Gow]. Our presentation fol- 
lows the argument in [RTTV] (which in turn was inspired by Nisan's proof of the Impagliazzo 
Hardcore Set Theorem [Imp]), but is translated to the original notation of Green, Tao, and 
Ziegler. 

We refer to our full paper [RTTV] for variants of the result with connections and applica- 
tions to computational complexity theory, and to Gowers' paper [Gow] for applications of the 
proof technique to "decomposition," "structure," and "transference" theorems in arithmetic and
extremal combinatorics (as well as a broader survey of such theorems). 

1 The Green-Tao-Ziegler Theorem 

Let X be a finite universe. We use the notation $\mathbb{E}_{x \in X} f(x) := \frac{1}{|X|} \sum_{x \in X} f(x)$. For two functions
$f, g : X \to \mathbb{R}$ we define their inner product as

$$\langle f, g \rangle := \mathbb{E}_{x \in X} f(x) g(x).$$

*Faculty of Mathematics and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel.
omer.reingold@weizmann.ac.il. Research supported by US-Israel Binational Science Foundation grant 2006060.

†Computer Science Division, U.C. Berkeley, luca@cs.berkeley.edu. Work partly done while visiting Princeton
University and the IAS. This material is based upon work supported by the National Science Foundation under grants
CCF-0515231 and CCF-0729137 and by the US-Israel Binational Science Foundation under grant 2006060.

‡Computer Science Division, U.C. Berkeley, madhurt@cs.berkeley.edu. Work partly done while visiting Princeton
University. This material is based upon work supported by the National Science Foundation under grants
CCF-0515231 and CCF-0729137 and by the US-Israel Binational Science Foundation under grant 2006060.

§School of Engineering and Applied Sciences, Harvard University, salil@eecs.harvard.edu. Work done during a
visit to U.C. Berkeley, supported by the Miller Foundation for Basic Research in Science, a Guggenheim Fellowship,
US-Israel Binational Science Foundation grant 2006060, and the Office of Naval Research grant N00014-04-1-0478.






A measure on X is a function $g : X \to \mathbb{R}$ such that $g \ge 0$ and $\mathbb{E}_{x \in X} g(x) \le 1$. A measure g is
bounded if $g \le 1$.

Let $\mathcal{F}$ be a collection of bounded functions $f : X \to [-1, 1]$. We say that two measures g, h are
$\varepsilon$-indistinguishable according to $\mathcal{F}$ if

$$\forall f \in \mathcal{F}. \ |\langle g - h, f \rangle| \le \varepsilon.$$

(It can be noted, although this fact will not be used, that if we define $\|g\|_{\mathcal{F}} := \max_{f \in \mathcal{F}} |\langle g, f \rangle|$, then
$\| \cdot \|_{\mathcal{F}}$ is a semi-norm, and we have that g and h are $\varepsilon$-indistinguishable if and only if $\|g - h\|_{\mathcal{F}} \le \varepsilon$.
Hence the notion of indistinguishability may be seen as a semi-metric imposed on the space of
functions $X \to \mathbb{R}$. If $\mathcal{F}$ contains all bounded functions $f : X \to [-1, 1]$, then $\| \cdot \|_{\mathcal{F}}$ is the standard
$\ell_1$ norm.)

We say that a measure g is $\varepsilon$-pseudorandom according to $\mathcal{F}$ if g and $1_X$ are $\varepsilon$-indistinguishable
according to $\mathcal{F}$, where $1_X$ is the function that is identically equal to 1.
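As an illustration only (it is not part of the original argument), the definitions above can be checked by brute force on a toy universe. The sketch below assumes X is a short Python list and functions are ordinary callables; the helper names `expectation`, `inner`, `distance`, `is_pseudorandom` and `scaled_indicator`, as well as the two test functions, are hypothetical choices made for this example.

```python
from fractions import Fraction

def expectation(f, X):
    """E_{x in X} f(x): the average of f over the finite universe X."""
    return sum(Fraction(f(x)) for x in X) / len(X)

def inner(f, g, X):
    """<f, g> = E_{x in X} f(x) g(x)."""
    return expectation(lambda x: Fraction(f(x)) * Fraction(g(x)), X)

def distance(g, h, family, X):
    """max over f in the family of |<g - h, f>| (the semi-norm of g - h)."""
    diff = lambda x: Fraction(g(x)) - Fraction(h(x))
    return max(abs(inner(diff, f, X)) for f in family)

def is_pseudorandom(nu, family, X, eps):
    """nu is eps-pseudorandom if it is eps-indistinguishable from the constant 1."""
    return distance(nu, lambda x: 1, family, X) <= eps

# Toy data (assumed for illustration): two +/-1 valued test functions on 8 points.
X = list(range(8))
family = [lambda x: 1 if x < 4 else -1, lambda x: 1 if x % 2 == 0 else -1]

def scaled_indicator(R):
    """The measure 1_R * |X|/|R|, which has expectation 1."""
    return lambda x: Fraction(len(X), len(R)) if x in R else 0

balanced = scaled_indicator({0, 3, 5, 6})   # looks uniform to both tests
lopsided = scaled_indicator({0, 1, 2, 3})   # detected by the first test
```

On this toy data the first measure is at distance 0 from $1_X$, while the second is at distance 1, so only the first is pseudorandom for any $\varepsilon < 1$.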

If $\mathcal{F}$ is a collection of bounded functions $f : X \to [-1, 1]$, we denote by $\mathcal{F}^k$ the collection of all
functions of the form $\prod_{i=1}^{k'} f_i$, where $f_i \in \mathcal{F}$ and $k' \le k$. In particular, if $\mathcal{F}$ is closed under
multiplication, then $\mathcal{F}^k = \mathcal{F}$.

Theorem 1.1 (Green, Tao, Ziegler [GT, TZ]) For every $\varepsilon > 0$, there is a $k = (1/\varepsilon)^{O(1)}$ and
an $\varepsilon' = \exp(-(1/\varepsilon)^{O(1)})$ such that the following holds:

Suppose that $\mathcal{F}$ is a finite collection of bounded functions $f : X \to [-1, 1]$ on a finite set X,
$\nu : X \to \mathbb{R}$ is an $\varepsilon'$-pseudorandom measure according to $\mathcal{F}^k$, and $g : X \to \mathbb{R}$ is a measure such that
$0 \le g \le \nu$.

Then there is a bounded measure $g_1 : X \to [0, 1]$ such that

1. $\mathbb{E}_{x \in X} g_1(x) = \mathbb{E}_{x \in X} g(x)$, and

2. $g_1$ and $g$ are $\varepsilon$-indistinguishable according to $\mathcal{F}$.

Green, Tao, and Ziegler [GT, TZ] state the conclusion in the following equivalent form: we can
write $g = g_1 + g_2$, where $g_1$ is a bounded measure, $g_1$ and $g$ have the same expectation, and $g_2$ is
nearly orthogonal to $\mathcal{F}$ in the sense that $|\langle g_2, f \rangle| \le \varepsilon$ for all $f \in \mathcal{F}$.

We now describe how the theorem can be interpreted as saying that "every dense subset of a 
pseudorandom set has a dense model", as mentioned in the abstract. From any sets $D \subseteq R \subseteq X$,
we can obtain measures $\nu \ge g$ by setting $\nu = 1_R \cdot |X|/|R|$ and $g = 1_D \cdot |X|/|R|$, where we write $1_S$
for the characteristic function of a set S. Then the condition that $\nu$ is $\varepsilon'$-pseudorandom according
to $\mathcal{F}$ says that every function $f \in \mathcal{F}$ has the same average over R as it does over X, to within $\pm\varepsilon'$,
which is a natural pseudorandomness property of the set R. And the expectation of g is precisely
the density of D in R, i.e. $|D|/|R|$. Now, assuming that R does indeed satisfy the foregoing
pseudorandomness property, let $g_1$ be the bounded function given in the conclusion of the theorem.
Suppose for starters that $g_1$ is the characteristic function of some set $M \subseteq X$. Then Item 1 says
M has the same density in X as D has in R. And Item 2 says that D and M are indistinguishable
from each other, in the sense that every function in $\mathcal{F}$ has the same average over both sets, to
within $\pm\varepsilon/\delta$, where $\delta = |D|/|R| = |M|/|X|$. So M is indeed a "dense model" of D.
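The translation from sets to measures can be made concrete with a small computation. The following is a minimal sketch under the assumption that X, R and D are small explicit sets; the function names and the particular sets are hypothetical and serve only to confirm that $\mathbb{E}\,\nu = 1$, $g \le \nu$, and $\mathbb{E}\, g = |D|/|R|$.

```python
from fractions import Fraction

def indicator_measures(X, R, D):
    """Return (nu, g) as dicts, with nu = 1_R * |X|/|R| and g = 1_D * |X|/|R|."""
    assert D <= R <= set(X), "expected D to be a subset of R, and R a subset of X"
    scale = Fraction(len(X), len(R))
    nu = {x: scale if x in R else Fraction(0) for x in X}
    g = {x: scale if x in D else Fraction(0) for x in X}
    return nu, g

def expectation(values):
    """Average of a function given as a dict over the whole universe."""
    return sum(values.values()) / len(values)

# Toy sets (assumed for illustration): R is half of X, D is half of R.
X = list(range(12))
R = {0, 2, 4, 6, 8, 10}
D = {0, 4, 8}
nu, g = indicator_measures(X, R, D)
density_of_D_in_R = expectation(g)
```

Here `density_of_D_in_R` equals $|D|/|R| = 1/2$ even though D occupies only a quarter of X, which is the sense in which a sparse set D can still be dense relative to R.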

The actual theorem above can be interpreted as simply allowing all of the sets, namely D and R 
in the hypothesis and M in the conclusion, to have their characteristic functions replaced with 
bounded measures of the same expectation. We note that, by an argument of Impagliazzo [Imp],
allowing the function $g_1$ in the conclusion to be a measure rather than the characteristic function of
some set M does not substantially weaken the theorem. Indeed, given $g_1$, we can construct a set M
using the probabilistic method, including each element $x \in X$ in M independently with probability
$g_1(x)$. Then, by Chernoff Bounds, M will have density at least $(1 - \varepsilon)\delta$ and its characteristic function
will be $2\varepsilon$-indistinguishable from g according to $\mathcal{F}$ with probability $1 - |\mathcal{F}| \cdot \exp(-\Omega(\delta \varepsilon^2 |X|))$.
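The rounding step just described is simple to simulate. The sketch below is only an illustration, assuming $g_1$ is given as a dict of probabilities; the function name `sample_model_set`, the toy measure, and the fixed seed are choices made here and do not come from the source.

```python
import random

def sample_model_set(g1, X, rng):
    """Include each x in M independently with probability g1[x]."""
    return {x for x in X if rng.random() < g1[x]}

# Toy bounded measure (assumed): 1 on 200 points, 1/2 on 200 points, 0 elsewhere,
# so its expectation over the 1000-point universe is 0.3.
X = list(range(1000))
g1 = {x: 1.0 if x < 200 else (0.5 if x < 400 else 0.0) for x in X}

M = sample_model_set(g1, X, random.Random(0))
density = len(M) / len(X)
```

Points with $g_1(x) = 1$ are always included and points with $g_1(x) = 0$ never are, so the only randomness is on the fractional part of $g_1$; the Chernoff bound controls how far `density` can drift from the expectation of $g_1$.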

2 Our Proof 

We prove the contrapositive: assuming that $g$ is $\varepsilon$-distinguishable from all dense models $g_1$ by
functions in $\mathcal{F}$, we prove that $\nu$ cannot be pseudorandom, i.e. it is $\varepsilon'$-distinguishable from $1_X$ by
some function in $\mathcal{F}^k$.

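Let $\delta := \mathbb{E}_{x \in X} g(x)$ and let us denote, for convenience, by G the set of "dense measures" $g_1 : X \to [0, 1]$
such that $\mathbb{E}\, g_1 = \delta$. Our assumption can be written as

$$\forall g_1 \in G. \ \exists f \in \mathcal{F}. \ |\langle g - g_1, f \rangle| > \varepsilon.$$

If we denote by $\mathcal{F}'$ the closure of $\mathcal{F}$ under negation, that is $\mathcal{F}' := \mathcal{F} \cup \{-f : f \in \mathcal{F}\}$, we can remove
the absolute values:

$$\forall g_1 \in G \ \exists f \in \mathcal{F}'. \ \langle g - g_1, f \rangle > \varepsilon. \tag{1}$$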

Proof outline. Suppose that we can manage to find a $g_1, f$ pair such that the above holds and for
which $g_1(x) = 1$ on every point in the support of $f$. Then it turns out that $f$ must also distinguish
$\nu$ from $1_X$. Indeed, $\langle f, \nu \rangle \ge \langle f, g \rangle$, because $\nu \ge g$ pointwise (and the $f$ we eventually use is
nonnegative), and $\langle f, 1_X \rangle = \langle f, g_1 \rangle$.
We will not be able to find such a $g_1, f$ pair with $f \in \mathcal{F}'$, but we will be able to do so with a
function $f$ that is a convex combination of functions in $\mathcal{F}'$ composed with a threshold function.
Then we show how to convert a distinguisher of such a form into a distinguisher that is a product
of at most k functions from $\mathcal{F}$.

In more detail, the proof will proceed in the following steps: 

1. By replacing $\mathcal{F}'$ with its convex hull, we reverse the order of quantifiers in (1), and obtain a
single $\bar f$ that $\varepsilon$-distinguishes g from every bounded measure $g_1$ of expectation $\delta$.

2. With an appropriate choice of $g_1$ (namely, the characteristic function of the $\delta|X|$ inputs on
which $\bar f$ is largest), we argue that a thresholded version of $\bar f$, denoted $\bar f_t$, continues to
$\Omega(\varepsilon)$-distinguish g from $g_1$, and has support contained in $g_1^{-1}(1)$. By the above argument, $\bar f_t$
$\Omega(\varepsilon)$-distinguishes $\nu$ from $1_X$.

3. By approximating the threshold function with a low-degree polynomial that has relatively
small coefficients, we deduce that there are at most k functions from $\mathcal{F}$ whose product
$\varepsilon'$-distinguishes $\nu$ from $1_X$.

Proof Details. We now proceed with Step 1, where we reverse the order of quantifiers in (1). 
Claim 2.1 There is a function $\bar f$ that is a convex combination of functions from $\mathcal{F}'$ and satisfies:

$$\forall g_1 \in G. \ \langle g - g_1, \bar f \rangle > \varepsilon.$$

Proof of claim: We use the min-max theorem for 2-player zero-sum games (which
is a consequence of the Hahn-Banach Theorem, as used in Gowers' version of the
proof [Gow]). We think of a zero-sum game where the first player picks a function
$f \in \mathcal{F}'$, the second player picks a function $g_1 \in G$, and the payoff is $\langle g - g_1, f \rangle$ for
the first player, and $-\langle g - g_1, f \rangle$ for the second player.

By the min-max theorem, the game has a "value" $\alpha$ for which the first player has an
optimal mixed strategy (a convex combination of strategies) $\bar f$, and the second player
has an optimal mixed strategy $\bar g_1$, such that

$$\forall g_1 \in G, \quad \langle g - g_1, \bar f \rangle \ge \alpha \tag{2}$$

and

$$\forall f \in \mathcal{F}', \quad \langle g - \bar g_1, f \rangle \le \alpha. \tag{3}$$

Since G is convex, $\bar g_1 \in G$, and our hypothesis tells us that there exists a function $f \in \mathcal{F}'$
such that

$$\langle g - \bar g_1, f \rangle > \varepsilon.$$

Taking this $f$ in Inequality (3), we get that $\alpha > \varepsilon$. The claim now follows from
Equation (2). □

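We now proceed with Step 2 of the proof. Let $S \subseteq X$ be the set of $\delta|X|$ elements of X that
maximize $\bar f$, and let $g_1$ be the characteristic function of S.¹ Then $g_1$ is a bounded measure of
expectation $\delta$, i.e. an element of G, so we have:

$$\langle g - g_1, \bar f \rangle > \varepsilon$$

or, equivalently,

$$\langle g, \bar f \rangle > \langle g_1, \bar f \rangle + \varepsilon. \tag{4}$$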

Now, we argue that by applying a threshold function to $\bar f$, we can ensure that $g_1 = 1$ at every point
in the support, while preserving the fact that we distinguish g from $g_1$. Specifically, for a threshold
t, define $\bar f_t : X \to \{0, 1\}$ to be the boolean function such that $\bar f_t(x) = 1$ if and only if $\bar f(x) \ge t$. We
will show that for some value of t, $\bar f_t$ has the properties we desire. Moreover, it will be important
for the final step to argue that the threshold is "robust" in the sense that it does not matter
what happens in a small interval around t, where the discontinuity of the threshold function could
cause problems. (Gowers [Gow] handles this issue differently, by instead showing that there is a
distinguisher of the form $\max\{0, \bar f(x) - t\}$, which has the advantage of being continuous everywhere
as a function of $\bar f(x)$.)



¹In case $\delta|X|$ is not an integer, we define $g_1$ to be 1 on the $\lfloor \delta|X| \rfloor$ inputs that maximize $\bar f$, to be 0 on the
$|X| - \lceil \delta|X| \rceil$ inputs that minimize $\bar f$, and to be an appropriate fractional value on the remaining element in order to
make the expectation of $g_1$ equal to $\delta$.
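The choice of $g_1$ in Step 2, including the fractional case from the footnote, can be sketched as follows. This is an illustrative reading rather than the authors' code: `top_delta_measure` is a hypothetical name, ties are broken by the original order of X, and $\bar f$ is passed as a plain callable.

```python
from fractions import Fraction
from math import floor

def top_delta_measure(f_bar, X, delta):
    """g1 = 1 on the floor(delta*|X|) points where f_bar is largest,
    a fractional value on the next point, and 0 elsewhere."""
    order = sorted(X, key=f_bar, reverse=True)   # largest f_bar first
    mass = Fraction(delta) * len(X)              # total mass that g1 must carry
    whole = floor(mass)
    g1 = {x: Fraction(0) for x in X}
    for x in order[:whole]:
        g1[x] = Fraction(1)
    if whole < len(order) and mass > whole:
        g1[order[whole]] = mass - whole          # leftover fractional mass
    return g1

# Toy example (assumed): five points with f_bar increasing in x, and delta = 1/2.
X = list(range(5))
f_bar = lambda x: Fraction(2 * x - 4, 4)
g1 = top_delta_measure(f_bar, X, Fraction(1, 2))
mean_g1 = sum(g1.values()) / len(X)
```

With $\delta|X| = 5/2$, the two largest points receive weight 1 and the third receives weight 1/2, so the expectation of $g_1$ is exactly $\delta$.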
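Claim 2.2 There is a threshold $t \in [-1 + \varepsilon/3, 1]$ such that

$$\langle g, \bar f_t \rangle \ge \langle g_1, \bar f_{t - \varepsilon/3} \rangle + \frac{\varepsilon}{3}.$$

Proof of claim: First, observe that

$$\bar f(x) = \int_{-1}^{1} \bar f_t(x) \, dt - 1.$$

From (4) and the fact that $\langle g, 1_X \rangle = \langle g_1, 1_X \rangle = \delta$, we have

$$\langle g, \bar f + 1 \rangle > \langle g_1, \bar f + 1 \rangle + \varepsilon,$$

which is equivalent to

$$\int_{-1}^{1} \langle g, \bar f_t \rangle \, dt > \int_{-1}^{1} \langle g_1, \bar f_t \rangle \, dt + \varepsilon. \tag{5}$$

Now if the claim were false, we would have

$$\begin{aligned}
\int_{-1}^{1} \langle g, \bar f_t \rangle \, dt
&= \int_{-1}^{-1 + \varepsilon/3} \langle g, \bar f_t \rangle \, dt + \int_{-1 + \varepsilon/3}^{1} \langle g, \bar f_t \rangle \, dt \\
&< \int_{-1}^{-1 + \varepsilon/3} \langle g, 1_X \rangle \, dt + \int_{-1 + \varepsilon/3}^{1} \left( \langle g_1, \bar f_{t - \varepsilon/3} \rangle + \frac{\varepsilon}{3} \right) dt \\
&\le \frac{\varepsilon}{3} + \int_{-1 + \varepsilon/3}^{1} \langle g_1, \bar f_{t - \varepsilon/3} \rangle \, dt + 2 \cdot \frac{\varepsilon}{3} \\
&\le \int_{-1}^{1} \langle g_1, \bar f_t \rangle \, dt + \varepsilon,
\end{aligned}$$

where the third line uses $\langle g, 1_X \rangle = \delta \le 1$ and the last line uses the substitution $t \mapsto t - \varepsilon/3$
together with $\langle g_1, \bar f_t \rangle \ge 0$. This contradicts Equation (5). □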
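The identity $\bar f(x) = \int_{-1}^{1} \bar f_t(x)\,dt - 1$ that opens the proof can be sanity-checked numerically for a single value $v = \bar f(x) \in [-1, 1]$. The sketch below uses a midpoint Riemann sum; the function names and the step count are assumptions of this example, not anything from the source.

```python
def threshold(value, t):
    """The boolean threshold: 1 if value >= t, else 0."""
    return 1 if value >= t else 0

def integral_of_thresholds(value, steps=20000):
    """Midpoint Riemann sum of t -> threshold(value, t) over [-1, 1]."""
    width = 2.0 / steps
    midpoints = (-1.0 + (i + 0.5) * width for i in range(steps))
    return width * sum(threshold(value, t) for t in midpoints)

# For each sample v, the integral minus 1 should reproduce v up to the grid width.
samples = [-1.0, -0.3, 0.0, 0.5, 1.0]
errors = [abs(integral_of_thresholds(v) - 1.0 - v) for v in samples]
```

Each entry of `errors` is bounded by the grid width, reflecting that the set of thresholds $t \in [-1, 1]$ with $t \le v$ has length exactly $v + 1$.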

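We now argue that $g_1$ is identically equal to 1 on the support of $\bar f_{t - \varepsilon/3}$. Recall $g_1$ is the characteristic
function of the set of $\delta|X|$ inputs maximizing $\bar f$. So if $g_1(x) < 1$ for some x in the support of $\bar f_{t - \varepsilon/3}$,
then $g_1(x) = 0$ everywhere outside the support of $\bar f_{t - \varepsilon/3}$. But then

$$\langle g_1, \bar f_{t - \varepsilon/3} \rangle = \langle g_1, 1_X \rangle = \delta = \langle g, 1_X \rangle \ge \langle g, \bar f_t \rangle,$$

in contradiction to Claim 2.2.
Putting everything together, we have

$$\begin{aligned}
\langle \nu, \bar f_t \rangle &\ge \langle g, \bar f_t \rangle \\
&\ge \langle g_1, \bar f_{t - \varepsilon/3} \rangle + \varepsilon/3 \\
&\ge \langle 1_X, \bar f_{t - \varepsilon/3} \rangle + \varepsilon/3.
\end{aligned}$$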

Finally, we proceed with Step 3, where we find a distinguisher that is defined as a product of
functions from $\mathcal{F}$ rather than being a threshold function applied to a convex combination of
elements of $\mathcal{F}'$. We do this by approximating the threshold function by a polynomial, using the
following special case of the Weierstrass Approximation Theorem.
Claim 2.3 For every $\alpha, \beta \in [0, 1]$, $t \in [\alpha, 1]$, there exists a polynomial p of degree $\mathrm{poly}(1/\alpha, 1/\beta)$
and with coefficients bounded in absolute value by $\exp(\mathrm{poly}(1/\alpha, 1/\beta))$ such that

1. For all $z \in [-1, 1]$, we have $p(z) \in [0, 1]$.

2. For all $z \in [-1, t - \alpha]$, we have $p(z) \in [0, \beta]$.

3. For all $z \in [t, 1]$, we have $p(z) \in [1 - \beta, 1]$.

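We set $\alpha = \varepsilon/3$ and $\beta = \varepsilon/12$ in the claim to obtain a polynomial $p(z) = \sum_{i=0}^{d} c_i z^i$ of degree
$d = \mathrm{poly}(1/\varepsilon)$ with coefficients satisfying $|c_i| \le \exp(\mathrm{poly}(1/\varepsilon))$ and such that for every x we have

$$\bar f_t(x) - \frac{\varepsilon}{12} \le (p \circ \bar f)(x) \le \bar f_{t - \varepsilon/3}(x) + \frac{\varepsilon}{12},$$

where $\circ$ denotes composition. From the properties of the polynomial p, we get

$$\langle \nu, p \circ \bar f \rangle \ge \langle \nu, \bar f_t \rangle - \frac{\varepsilon}{12}$$

and

$$\langle 1_X, p \circ \bar f \rangle \le \langle 1_X, \bar f_{t - \varepsilon/3} \rangle + \frac{\varepsilon}{12},$$

giving

$$\langle \nu, p \circ \bar f \rangle \ge \langle 1_X, p \circ \bar f \rangle + \frac{\varepsilon}{6}. \tag{6}$$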



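If the polynomial $p \circ \bar f = \sum_{i} c_i \bar f^i$ has inner product at least $\varepsilon/6$ with $\nu - 1$, there must exist a single
term $c_k \bar f^k$ whose inner product with $\nu - 1$ is at least $\varepsilon/(6(d + 1))$, which in turn implies that $\bar f^k$
has inner product of absolute value at least $\varepsilon' := \varepsilon/(6(d + 1)|c_k|) = \exp(-\mathrm{poly}(1/\varepsilon))$ with $\nu - 1$:

$$|\langle \nu - 1, \bar f^k \rangle| \ge \varepsilon'.$$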

Suppose that $\langle \nu - 1, \bar f^k \rangle \ge \varepsilon'$. (The reasoning is analogous in the case of a negative inner product.)
Recall that the function $\bar f$ is a convex combination of functions from $\mathcal{F}'$. This means that we
may think of $\bar f(x)$ as being the expectation of a random variable $f(x)$, in which the function $f$
is picked according to some probability measure on $\mathcal{F}'$. Then the value $\bar f(x)^k$ is the expectation
of the process where we sample independently k functions $f_1, \ldots, f_k$ as before, and then compute
$\prod_i f_i(x)$. By linearity of expectation, we can write

$$\varepsilon' \le \langle \nu - 1, \bar f^k \rangle = \mathbb{E} \left\langle \nu - 1, \prod_i f_i \right\rangle,$$

where the expectation is over the choices of the functions $f_i$ as described above. We can now
conclude that there is a point in the sample space where a random variable takes a value at least as
large as its expectation, and so there are functions $f_1, \ldots, f_k \in \mathcal{F}'$ such that

$$\left\langle \nu - 1, \prod_i f_i \right\rangle \ge \varepsilon'.$$

Finally, replacing $f_i$ with $-f_i$ as appropriate, we can have all the functions $f_i$ be in $\mathcal{F}$ itself (rather
than $\mathcal{F}'$).
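The averaging step at the end of the proof can be verified exhaustively on small examples. The sketch below is a brute-force illustration, assuming functions are stored as lists of values over X and the convex combination is given by explicit weights; `best_product` and the toy data are hypothetical. It expands $\langle \nu - 1, \bar f^k \rangle$ over all k-tuples and returns a tuple whose product does at least as well as the average.

```python
from fractions import Fraction
from itertools import product
from math import prod

def inner(u, v):
    """<u, v> for functions stored as equal-length lists of values over X."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

def best_product(weights, funcs, target, k):
    """Expand <target, f_bar^k> as a weighted average over k-tuples of funcs.
    Returns (the average, the best index tuple, that tuple's inner product)."""
    n = len(target)
    total, best = Fraction(0), None
    for idx in product(range(len(funcs)), repeat=k):
        w = prod(weights[i] for i in idx)        # probability of sampling this tuple
        if w == 0:
            continue
        prod_fn = [prod(funcs[i][x] for i in idx) for x in range(n)]
        val = inner(target, prod_fn)
        total += w * val
        if best is None or val > best[1]:
            best = (idx, val)
    return total, best[0], best[1]

# Toy data (assumed): three +/-1 functions on 4 points, and nu - 1 as the target.
funcs = [[1, 1, -1, -1], [1, -1, 1, -1], [1, 1, 1, -1]]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
target = [Fraction(1), Fraction(1), Fraction(-1), Fraction(-1)]
k = 2
f_bar = [sum(w * f[x] for w, f in zip(weights, funcs)) for x in range(4)]
direct = inner(target, [v ** k for v in f_bar])
average, best_idx, best_val = best_product(weights, funcs, target, k)
```

The value `average` coincides with the direct computation of $\langle \nu - 1, \bar f^k \rangle$, and `best_val` is at least as large, which is the "some point is at least the expectation" step in miniature. The enumeration is exponential in k and is meant only as a check on tiny instances.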